-
Notifications
You must be signed in to change notification settings - Fork 44
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enable performing inferences for gRPC (grpcurl) #183
Conversation
Hi @ymoatti. Thanks for your PR. I'm waiting for a opendatahub-io member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
demo/kserve/custom-manifests/caikit/caikit-tgis-servingruntime-http.yaml
Outdated
Show resolved
Hide resolved
Thanks for your PR. This issue happens because we are moving forward :) Please see the inline comments:
This script is not for various situations. It is kind of a quick start like a demo so I don't see any reason to use generic value.
This should be a bug.
There are 2 ways to set Plus, I will review your PR. |
For item a, I think we can keep the original generic YAML but we should add the concrete Minio as a working example, especially given that the original scenario setup begins with an explicit example of setting up Minio for object storage. Item b - bug, fixed by the PR, and we further replace KSVC-based setup with ISVC-based setup Ok. Item c - bug, fixed by the PR. Agreed. Only point remaining - not sure what is JSON-style adding of storage-config. @Jooho Can you give an example? |
demo/kserve/custom-manifests/caikit/caikit-tgis-isvc-template.yaml
Outdated
Show resolved
Hide resolved
Folks, we made some progress with the JSON configuration but it's still not working and the year is ending, so this PR might have to delay quite a bit. Given that JSON config was not part of the original issues for this PR, can we agree to merge this PR using the SA-associated secret (as was the original code before the PR) just so that people can use the deploy-remove scenario without issues? This is already solved and can be reviewed and merged within a day's work. The new stuff, such as JSON config and using ISVC instead of KSVC to define the virtual host, can be deferred to a later PR. |
demo/kserve/custom-manifests/caikit/caikit-tgis-servingruntime-grpc.yaml
Show resolved
Hide resolved
demo/kserve/custom-manifests/caikit/caikit-tgis-servingruntime-grpc.yaml
Show resolved
Hide resolved
added a few nits, I'll give the last word to @Xaenalt |
…dition a generic yaml file
… caikit-tgis-servingruntime.yaml
…ments in ISVC template file
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: dtrifiro, Xaenalt, ymoatti The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
The documentation and the scripts have multiple bugs which prevent using gRPC (grpcurl) for performing inferences and make difficult the usage of HTTP (curl).
This PR has for goal to fix the documentation, the scripts and related yaml files.
Description
Current documentation and scripts need to be fixed for HTTP (curl) as we have several problems such as:
a. Generic “storageUri” value in caikit-tgis-isvc.yaml instead of actual value
b. Error in how KSVC_HOSTNAME is computed in deploy-remove.md (specifies "caikit-example-isvc-predictor" instead of "caikit-tgis-example-isvc-predictor")
c. Missing serviceAccountName field in caikit-tgis-isvc.yaml
In addition, since Knative has no support for multiple ports managed by Knative Serving, we could not create ISVCs for more than a single protocol, thus rendering the usage of gRPC impossible with current scripts and documentation.
Our solution, beyond simple fixes of previously mentioned problems was to build a ServingRuntime specific for each protocol chosen with the adequate ports and details.
The user is required to specify the targeted protocol, either by mentioning its value as a variable (for step-to-step) or by passing it as an argument to the deploy_model script.
Similarly, the inference-call which replaces both http-call and grpc-call is passed the targeted protocol and will perform the inference accordingly.
We have chosen to well separate the HTTP versus gRPC implementations by having separate namespaces (resp. kserve-demo-http and kserve-demo-grpc). This permits to create/delete models for these protocols independently each of the other.
Finally, the launch of the metric service/monitor which appeared in the deploy-model.sh script but not in the step-by-step instructions have been commented out in deploy-model.sh. If judged necessary, they can be reactivated in which case the step-by-step documentation should also be modified.
How Has This Been Tested?
We first applied the script-based installation of Kserve and dependencies which was not changed.
On that basis we performed many possible combinations of:
manual deployments of models (http or grpc)
scripted deployments models (http or grpc)
manual invocations of inferences (http or grpc)
scripted invocations of inferences (http or grpc)
scripted deletion of models (http or grpc)
As said, we tried various combinations of the basic possible actions as mentioned above and checked the expected outcome (either success of inference invocation or disappearance of targeted namespace / artifacts for model deletion).
The main platform on which we tested was AWS ROSA.
Other areas of the code should not be affected since we only change documentation and scripts.
Merge criteria: